Optimizing Hierarchical Storage Management For Database System

Author

  • Xin Liu
Abstract

Caching is a classical but effective way to improve system performance. Servers such as database servers and storage servers contain significant amounts of memory that act as a fast cache. Meanwhile, as new storage devices such as flash-based solid state drives (SSDs) are added to storage systems over time, the memory cache is no longer the only way to improve system performance. In this thesis, we address the problems of how to manage the cache of a storage server and how to utilize the SSD in a hybrid storage system.

Traditional caching policies are known to perform poorly for storage server caches. One promising approach to solving this problem is to use hints from the storage clients to manage the storage server cache. Previous hinting approaches are ad hoc, in that a predefined reaction to specific types of hints is hard-coded into the caching policy. With ad hoc approaches, it is difficult to ensure that the best hints are being used, and it is difficult to accommodate multiple types of hints and multiple client applications. In this thesis, we propose CLient-Informed Caching (CLIC), a generic hint-based technique for managing storage server caches. CLIC automatically interprets hints generated by storage clients and translates them into a server caching policy. It does this without explicit knowledge of the application-specific hint semantics. We demonstrate using trace-based simulation of database workloads that CLIC outperforms hint-oblivious and state-of-the-art hint-aware caching policies. We also demonstrate that the space required to track and interpret hints is small.

SSDs are becoming a part of the storage system. Adding an SSD to a storage system raises not only the question of how to manage the SSD, but also the question of whether current buffer pool algorithms will still work effectively. We are interested in the use of hybrid storage systems, consisting of SSDs and hard disk drives (HDDs), for database management. We present cost-aware replacement algorithms for both the DBMS buffer pool and the SSD. These algorithms are aware of the different I/O performance of HDD and SSD. In such a hybrid storage system, the physical access pattern to the SSD depends on the management of the DBMS buffer pool. We studied the impact of the buffer pool caching policies on the access patterns of the SSD. Based on these studies, we designed a caching policy to effectively manage the SSD. We implemented these algorithms in MySQL’s InnoDB storage engine and used the TPC-C workload to demonstrate that these cost-aware algorithms outperform previous algorithms.
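
To make the idea of cost-aware replacement concrete, here is a minimal Python sketch. It is not the algorithm from the thesis, whose details the abstract does not give. It substitutes the classic GreedyDual policy, which keeps a page longer when re-reading it would be more expensive. The class name `GreedyDualCache`, the `READ_COST` table, and the cost values in it are all hypothetical and chosen only for illustration.

```python
# Hypothetical relative read costs; real values depend on the actual devices.
READ_COST = {"hdd": 10.0, "ssd": 1.0}


class GreedyDualCache:
    """Illustrative cost-aware cache using the GreedyDual policy."""

    def __init__(self, capacity):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.capacity = capacity
        self.inflation = 0.0  # rises on each eviction, so old entries age out
        self.priority = {}    # page_id -> inflation at last access + read cost

    def access(self, page_id, device):
        """Record an access to page_id; return True on a hit, False on a miss."""
        hit = page_id in self.priority
        if not hit and len(self.priority) >= self.capacity:
            # Evict the page with the lowest priority and raise the
            # inflation level to that priority.
            victim = min(self.priority, key=self.priority.get)
            self.inflation = self.priority.pop(victim)
        # On both hits and misses, refresh the priority of the accessed page.
        self.priority[page_id] = self.inflation + READ_COST[device]
        return hit
```

With a capacity of two pages, accessing an HDD page `a` and then SSD pages `b` and `c` evicts `b` rather than `a`, whereas plain LRU would evict `a`. Because the inflation value keeps rising, an HDD page that is never touched again is still evicted eventually.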

Similar resources

Smart Hierarchical Storage Support for Large-Scale Multidimensional Array Database Management Systems

Large-scale scientific experiments or simulation programs often generate large amounts of multidimensional data. Data volume may reach hundreds of terabytes (up to petabytes). In the present and the near future, the only practicable way for storing such large volumes of multidimensional data is tertiary storage systems. But commercial (multidimensional) database systems are optimized for perfor...

Hierarchical Storage Support and Management for Large-Scale Multidimensional Array Database Management Systems

Large-scale scientific experiments or simulation programs often generate large amounts of multidimensional data. Data volume may reach hundreds of terabytes (up to petabytes). In the present and the near future, the only practicable way for storing such large volumes of multidimensional data is tertiary storage systems. But commercial (multidimensional) database systems are optimized for perfo...

Optimizing a Hierarchical Hub Covering Problem with Mandatory Dispersion of Central Hubs

The hierarchical hub location problem arises in three-level networks used in production-distribution systems, education systems, emergency medical services, telecommunication networks, etc. This paper addresses the hierarchical hub covering problem with single assignment, treating the mandatory dispersion of central hubs as a special case. This formulation with incorpor...

StorHouse/Relational Manager (RM) - Active Storage Hierarchy Database System and Applications

This paper describes how database systems can use and exploit a cost-effective active storage hierarchy. By active storage hierarchy we mean a database system that uses all storage media (i.e. optical, tape, and disk) to store and retrieve data and not just disk. We describe and emphasize the active part, whereby all storage types are used to store raw data that is converted to strategic busine...

SLEDs: Storage Latency Estimation Descriptors

Managing the latency of storage systems is a key to creating effective very large scale information systems, such as web interfaces to satellite image databases and video-on-demand servers. Storage Latency Estimation Descriptors (SLEDs) are architecture-independent descriptions of the retrieval time of a unit of data. They describe the latency to the first byte, and the bandwidth expected. SLED...

Publication date: 2014